Sequence-Dependent Effects on the Properties of Semiflexible Biopolymers
Using a path integral technique, we show exactly that for a semiflexible biopolymer in the constant
extension ensemble, no matter how long the polymer and how large the external force, the effects 
of short range correlations in the sequence-dependent spontaneous curvatures and torsions can be 
incorporated into a model with well-defined mean spontaneous curvature and torsion as well as 
a renormalized persistence length. Moreover, for a long biopolymer with large mean persistence 
length, the sequence-dependent persistence lengths can be replaced by their mean. However, for a 
short biopolymer or for a biopolymer with small persistence lengths, inhomogeneity in persistence 
lengths tends to make physical observables very sensitive to details and therefore less predictable. 

PACS numbers: 87.15.-v, 87.10.Pq, 36.20.Ey, 87.15.A- 



I. INTRODUCTION 



The conformational and mechanical properties of double-stranded DNA
(dsDNA) have attracted considerable attention due to the central role
that dsDNA plays in biological processes. Recent progress in
experimental techniques such as laser or magnetic tweezers, atomic force
microscopy, and other single-molecule techniques makes it possible to
manipulate and observe single biomolecules directly [?], allowing better
comparisons between theoretical predictions and experimental
observations. In theoretical studies, a semiflexible biopolymer is often
modelled as a filament. The simplest model for a filament, called the
wormlike chain (WLC) model, views the filament as an inextensible
continuous chain with a uniform bending rigidity but with vanishing
cross section, and has been successfully applied to the entropic
elasticity of dsDNA [?]. Furthermore, the wormlike rod chain (WLRC)
model, which regards the filament as a chain with spontaneous twist and
a finite circular cross section, has been used to explain the
supercoiling property of dsDNA [?]. Owing to the importance of DNA,
there has recently been a lot of theoretical work on the WLC and WLRC
models as well as their modifications and extensions [?].

Traditional models of filaments are essentially homogeneous. In other
words, these models are defined by $s$-independent parameters, where $s$
is the arclength. However, biopolymers are often sequence-dependent and
so are heterogeneous. Several recent works have revealed that sequence
disorder has remarkable effects on the properties of dsDNA [11, ?].
Based on the elastic model, two effects of sequence disorder have to be
considered. First, structural inhomogeneity results in variations of the
bending rigidity along the chain, and can be described by the
$s$-dependent persistence length $l_p(s)$ [19, 23]. It has been
demonstrated that for a long DNA chain without long-range correlation
(LRC) in $l_p(s)$, this effect can be well accounted for by a simple
replacement of the uniform persistence length $l_p$ in the WLC model by
a proper average of $l_p(s)$. However, for loop formation in a short DNA
chain this effect becomes complex because the looping probability of a
typical filament segment is not a well-defined function of its length
[19]. Second, the local structure of the dsDNA can be characterized by
the sequence-dependent spontaneous curvature $\kappa_0(s)$ [13, ?]. For
short dsDNA chains, special sequence order may favor a macroscopic
spontaneous curvature [?, 27]. On the other hand, for long dsDNA chains,
the effect of $\kappa_0(s)$ depends on the degree of correlation between
basepairs. Without correlation or with short-range correlation (SRC),
the effect can also be reduced to a renormalization of $l_p$ in the WLC
model [13, ?]. However, with LRC, the simple correction to $l_p$ is
invalid because the biopolymer develops a macroscopic intrinsic
curvature [20]. Moreover, computer simulations suggest that the mean of
$\kappa_0(s)$, rather than the details of its distribution, determines
the looping probability of a filament [?]. However, all analytical
approaches to the sequence-dependent effects are limited to specific
properties and to a WLC-based model with vanishing intrinsic curvature
and with weak or vanishing external force; a rigorous proof for the
general elastic continuous model is still elusive. Bearing in mind that
many dsDNA molecules possess macroscopic intrinsic curvature
[13, ?, 23], an analytical approach to the general model is of special
importance.

II. MODEL

Using $s$ as the variable, the configuration of a filament can be
described by a triad of unit vectors $\{\mathbf{t}_i(s)\}_{i=1,2,3}$,
where $\mathbf{t}_3 = d\mathbf{r}(s)/ds$ is the tangent to the center
line $\mathbf{r}(s)$ of the filament, and $\mathbf{t}_1$ and
$\mathbf{t}_2$ are oriented along the principal axes of the cross
section. The orientation of the triad as one moves along the filament is
given by the solution of the generalized Frenet equations that describe
the rotation of the triad vectors [11, ?, 13],
$d\mathbf{t}_i(s)/ds = -\sum_{j,k}\epsilon_{ijk}\,\omega_j(s)\,\mathbf{t}_k(s)$,
where $\epsilon_{ijk}$ is the antisymmetric tensor, and
$\{\omega_i(s)\}$ are the curvature and torsion parameters.
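
As an illustration of the generalized Frenet equations, the following minimal sketch propagates the triad numerically. It assumes constant $\omega_i$ (the text allows them to depend on $s$) and uses the equivalent form $d\mathbf{t}_i/ds = \boldsymbol{\Omega}\times\mathbf{t}_i$ with $\boldsymbol{\Omega} = \sum_j \omega_j\mathbf{t}_j$, which follows from the equation above for a right-handed triad. The function names and the step scheme are illustrative choices, not part of the original work.

```python
import math

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def rotate(v, axis, angle):
    """Rodrigues rotation of v about the unit vector `axis` by `angle`."""
    c, s = math.cos(angle), math.sin(angle)
    kxv = cross(axis, v)
    kv = dot(axis, v)
    return [v[i] * c + kxv[i] * s + axis[i] * kv * (1.0 - c) for i in range(3)]

def integrate_triad(omega, length, n_steps):
    """Propagate the triad t_1, t_2, t_3 and the center line r(s).

    Assumption: omega = (w1, w2, w3) is constant along the filament.
    Each step rotates the whole triad about Omega = sum_j w_j t_j.
    """
    t = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    r = [0.0, 0.0, 0.0]
    ds = length / n_steps
    for _ in range(n_steps):
        # dr/ds = t_3 (simple forward step along the current tangent)
        r = [r[c] + ds * t[2][c] for c in range(3)]
        big_omega = [sum(omega[j] * t[j][c] for j in range(3)) for c in range(3)]
        norm = math.sqrt(dot(big_omega, big_omega))
        if norm > 0.0:
            axis = [x / norm for x in big_omega]
            t = [rotate(ti, axis, norm * ds) for ti in t]
    return r, t

# Pure bending about t_1 with curvature kappa closes into a circle of
# circumference 2*pi/kappa.
kappa = 0.5
r_end, triad = integrate_triad((kappa, 0.0, 0.0), 2.0 * math.pi / kappa, 1000)
```

With these inputs the center line returns to its starting point and the triad stays orthonormal, which is a quick sanity check on the sign convention.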

The elastic energy of a filament with $s$-dependent spontaneous
curvatures $\zeta_1(s)$, $\zeta_2(s)$, spontaneous twist rate
$\zeta_3(s)$ and persistence lengths $a_i(s)$ can be written as

$$ \mathcal{E} \equiv \frac{E}{k_B T} = \frac{1}{2}\int_0^L \sum_{i=1}^{3} a_i(s)\,[\omega_i(s)-\zeta_i(s)]^2\,ds, \tag{1} $$

where $T$ is the temperature, $k_B$ is the Boltzmann constant, and $L$
is the total arclength of the filament and is a constant so that the
filament is inextensible.
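
For numerical work the energy functional is usually evaluated on a grid. The sketch below assumes a uniform grid of $N$ sites with spacing $\epsilon = L/N$ and a simple Riemann sum; the function name and argument layout are hypothetical.

```python
def elastic_energy(a, omega, zeta, length):
    """Discretized E/(k_B T) = (1/2) * sum_i sum_j eps * a_ij * (w_ij - z_ij)**2.

    a, omega, zeta: three lists (i = 1, 2, 3), each holding N grid values.
    Assumption: uniform grid with spacing eps = length / N.
    """
    n_sites = len(omega[0])
    eps = length / n_sites
    total = 0.0
    for i in range(3):
        for j in range(n_sites):
            d = omega[i][j] - zeta[i][j]
            total += 0.5 * eps * a[i][j] * d * d
    return total

# A uniform excess curvature of 0.1 in one component, with a_i = 50 and L = 10
n = 100
a = [[50.0] * n for _ in range(3)]
zeta = [[0.0] * n for _ in range(3)]
omega = [[0.1] * n, [0.0] * n, [0.0] * n]
energy = elastic_energy(a, omega, zeta, 10.0)
```

The energy vanishes when $\omega_i(s) = \zeta_i(s)$ everywhere, which is the sense in which $\zeta_i$ are the spontaneous curvatures and twist rate.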

If $\zeta_i$ and $a_i$ are well-defined (i.e., without randomness)
functions of $s$, a macroscopic quantity $B$ is defined as the average
with Boltzmann weights over all possible conformations, and so is a path
integral of the form [11, ?]

$$ B = \frac{\int\mathcal{D}[\omega_i]\,B[\{\omega_i(s)\}]\,e^{-\mathcal{E}}}{\int\mathcal{D}[\omega_i]\,e^{-\mathcal{E}}}. \tag{2} $$

The functional $B[\{\omega_i(s)\}]$ represents different physical
situations. For instance, if
$B[\{\omega_i(s)\}] = \mathbf{t}_j(s_1)\cdot\mathbf{t}_k(s_2)$, we find
the orientational correlation function between $\mathbf{t}_j$ and
$\mathbf{t}_k$; if $B[\{\omega_i(s)\}] = |\mathbf{r}_L-\mathbf{r}_0|^2$,
we obtain the mean-square end-to-end distance, where
$\mathbf{r}_L = \mathbf{r}(L)$ and $\mathbf{r}_0 = \mathbf{r}(0)$; if
$B[\{\omega_i(s)\}] = \delta(\mathbf{r}-\int_0^L\mathbf{t}_3\,ds)$, we
get the distribution function of the end-to-end vector, from which the
applied force can be evaluated; if
$B[\{\omega_i(s)\}] = \delta(\mathbf{r}_L-\mathbf{r}_0)\,\delta[\mathbf{t}_3(L)-\mathbf{t}_3(0)]$,
we find the looping probability. Note that $B[\{\omega_i(s)\}]$ may be a
very complex functional of $\omega_i(s)$, but its detailed form is
irrelevant in this work since it is independent of $a_i$ and $\zeta_i$.
If both ends are free of external force, $B$ represents an intrinsic
property of the system. On the other hand, if we fix both ends of the
filament, we obtain the quantity in the constant extension ensemble.



III. THE EFFECTS OF THE 
SEQUENCE-DEPENDENT SPONTANEOUS 
CURVATURES AND TORSIONS 

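We first consider the effects of the $\zeta_i(s)$ alone but leave the
$a_i$'s well-defined. For a biopolymer without correlation in
$\zeta_i(s)$, or with SRC but in the coarse-grained model, the
distribution of $\zeta_i(s)$, $W(\{\zeta_i\})$, can be written as a
Gaussian distribution with nonvanishing average $\omega_{i0}$,

$$ W(\{\zeta_i\}) = \exp\left\{-\frac{1}{2}\int_0^L\sum_{i=1}^{3} k_i\,[\zeta_i(s)-\omega_{i0}]^2\,ds\right\}. \tag{3} $$

In other words, the $\zeta_i(s)$'s are delta correlated along the chain:

$$ \langle[\zeta_i(s)-\omega_{i0}][\zeta_j(s')-\omega_{j0}]\rangle = \frac{1}{k_i}\,\delta_{ij}\,\delta(s-s'). \tag{4} $$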
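
A delta-correlated Gaussian field can be sampled on a grid by drawing independent values at each site. The sketch below assumes the usual lattice replacement $\delta(s-s') \to \delta_{jj'}/\epsilon$, so that each site has variance $1/(k\epsilon)$; the function name and the parameter values are illustrative only.

```python
import math
import random

def sample_zeta(k, omega0, length, n_sites, rng):
    """Draw one component zeta(s) on a uniform grid with spacing eps = L/N.

    Assumption: delta(s - s') is replaced by delta_{jj'} / eps, so each
    site is an independent Gaussian with mean omega0 and variance 1/(k*eps).
    """
    eps = length / n_sites
    sigma = 1.0 / math.sqrt(k * eps)
    return [rng.gauss(omega0, sigma) for _ in range(n_sites)]

rng = random.Random(0)
zeta = sample_zeta(k=2.0, omega0=0.3, length=100.0, n_sites=200000, rng=rng)

n = len(zeta)
mean = sum(zeta) / n
variance = sum((z - mean) ** 2 for z in zeta) / n
# Covariance between neighbouring sites should be close to zero.
neighbour_cov = sum((zeta[j] - mean) * (zeta[j + 1] - mean) for j in range(n - 1)) / (n - 1)
```

The sample variance should approach $1/(k\epsilon)$ and the neighbour covariance should be small compared with it, which is the discrete counterpart of the delta correlation.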



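In this case, we need to average over $\zeta_i$ again for $B$, so

$$ B_\zeta = \frac{1}{Z_\zeta}\int\mathcal{D}[\zeta_i]\,W(\{\zeta_i\})\left\{\frac{1}{Z_\omega}\int\mathcal{D}[\omega_i]\,B[\{\omega_i(s)\}]\,e^{-\mathcal{E}}\right\}, \tag{5} $$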



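where $Z_\omega = \int\mathcal{D}[\omega_i]\,e^{-\mathcal{E}}$ and
$Z_\zeta = \int\mathcal{D}[\zeta_i]\,W(\{\zeta_i\})$ are essentially
Gaussian integrals, so they are independent of $\zeta_i$ or $\omega_i$
but dependent on $a_i$ or $k_i$, respectively. Now using the identity

$$ \int\mathcal{D}[\zeta]\,e^{-\frac{1}{2}\int[a(\omega-\zeta)^2+k(\zeta-\omega_0)^2]\,ds} = e^{-\int\frac{\beta}{2}[\omega(s)-\omega_0]^2\,ds}\int\mathcal{D}[\zeta]\,e^{-\frac{1}{2}\int(a+k)\,\zeta^2\,ds}, \qquad \beta = \frac{ak}{a+k}, \tag{6} $$

and exchanging the order of integration, we finally obtain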



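$$ \begin{aligned} B_\zeta &= \frac{1}{Z_\omega Z_\zeta}\int\mathcal{D}[\zeta_i]\,W(\{\zeta_i\})\left\{\int\mathcal{D}[\omega_i]\,e^{-\mathcal{E}}\,B[\{\omega_i(s)\}]\right\} \\ &= \frac{1}{Z_\omega Z_\zeta}\int\mathcal{D}[\omega_i]\,B[\{\omega_i(s)\}]\int\mathcal{D}[\zeta_i]\,e^{-\mathcal{E}}\,W(\{\zeta_i\}) \\ &= \frac{\int\mathcal{D}[\omega_i]\,B[\{\omega_i(s)\}]\,e^{-\mathcal{H}}}{\int\mathcal{D}[\omega_i]\,e^{-\mathcal{H}}}, \end{aligned} \tag{7} $$

where

$$ \mathcal{H} = \frac{1}{2}\int_0^L\sum_{i=1}^{3}\beta_i\,(\omega_i-\omega_{i0})^2\,ds, \tag{8} $$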



and $\beta_i = a_i k_i/(a_i + k_i)$. Note that Eq. (7) is valid for any
length, and even if $a_i$, $k_i$ and $\omega_{i0}$ are $s$-dependent.
Comparing Eqs. (2) and (7), we reach the conclusion that the effects of
$\zeta_i(s)$ can be incorporated into a model with well-defined mean
spontaneous curvatures and torsion $\omega_{i0}$ as well as renormalized
persistence lengths $\beta_i$. This conclusion agrees with what has been
found in the special case with $k_1 = k_2$, $a_1 = a_2 = l_p$ and
$\omega_{10} = \omega_{20} = 0$, namely that the randomness of the
$\zeta_i(s)$ can be accounted for by replacing $l_p$ with an effective
persistence length $l_p^{\mathrm{eff}}$ in the WLC model, where
$1/l_p^{\mathrm{eff}} = 1/l_p + 1/k_1$.
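
The Gaussian identity behind the renormalized persistence lengths can be checked numerically for a single grid site. The sketch below assumes a lattice spacing $\epsilon$ and integrates over one value of $\zeta$ with the trapezoid rule; the function names, the integration window, and the parameter values are illustrative.

```python
import math

def zeta_integral(a, k, omega, omega0, eps, n=4001, width=12.0):
    """Trapezoid estimate of the single-site integral over zeta of
    exp(-(eps/2) * [a*(omega - zeta)**2 + k*(zeta - omega0)**2])."""
    center = (a * omega + k * omega0) / (a + k)   # peak of the integrand
    sigma = 1.0 / math.sqrt(eps * (a + k))        # its Gaussian width
    lo = center - width * sigma
    h = 2.0 * width * sigma / (n - 1)
    total = 0.0
    for m in range(n):
        z = lo + m * h
        f = math.exp(-0.5 * eps * (a * (omega - z) ** 2 + k * (z - omega0) ** 2))
        total += f if 0 < m < n - 1 else 0.5 * f
    return total * h

def closed_form(a, k, omega, omega0, eps):
    """Same integral after completing the square, with beta = a*k/(a + k)."""
    beta = a * k / (a + k)
    prefactor = math.sqrt(2.0 * math.pi / (eps * (a + k)))
    return prefactor * math.exp(-0.5 * eps * beta * (omega - omega0) ** 2)

numeric = zeta_integral(a=50.0, k=20.0, omega=0.4, omega0=0.1, eps=0.5)
analytic = closed_form(a=50.0, k=20.0, omega=0.4, omega0=0.1, eps=0.5)
```

The two values agree to high precision, and the only dependence on $\omega$ that survives the $\zeta$ integration is the factor with stiffness $\beta$, equivalently $1/\beta = 1/a + 1/k$.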

A different form of $l_p^{\mathrm{eff}}$ is obtained with a
half-Gaussian distribution of disorder on curvature.
Our result also agrees with the conclusion obtained from computer
simulation that the mean spontaneous curvature, rather than the details
of its distribution, determines the looping probability of a filament
[?]. We should note that the proofs in Refs. [17, 18, 19] are limited to
the special case with $a_1 = a_2$, $\omega_{10} = \omega_{20} = 0$, and
weak or vanishing external force, whereas our proof is rigorous and
generally valid.
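
The renormalization can be illustrated with a small Monte Carlo experiment. The sketch below is a simplified two-dimensional analogue with a single curvature component and no external force (a different setting from the constant extension ensemble of the proof); the grid discretization, function names, and parameter values are all assumptions made for illustration. Each chain draws its own random $\zeta$, then a thermal $\omega$ around it, and the tangent angle is accumulated along the chain.

```python
import math
import random

def tangent_correlation(a, k, omega0, length, n_sites, n_chains, rng):
    """Monte Carlo estimate of <cos(theta(L) - theta(0))> for a 2D filament.

    Assumptions: one curvature component, uniform grid (eps = L/N),
    zeta_j ~ N(omega0, 1/(k*eps)) and omega_j ~ N(zeta_j, 1/(a*eps)).
    """
    eps = length / n_sites
    sigma_zeta = 1.0 / math.sqrt(k * eps)
    sigma_omega = 1.0 / math.sqrt(a * eps)
    total = 0.0
    for _ in range(n_chains):
        theta = 0.0
        for _ in range(n_sites):
            zeta = rng.gauss(omega0, sigma_zeta)    # sequence disorder
            omega = rng.gauss(zeta, sigma_omega)    # thermal fluctuation
            theta += omega * eps
        total += math.cos(theta)
    return total / n_chains

def renormalized_prediction(a, k, omega0, length):
    """Homogeneous model with mean curvature omega0 and stiffness beta."""
    beta = a * k / (a + k)
    return math.cos(omega0 * length) * math.exp(-length / (2.0 * beta))

rng = random.Random(1)
estimate = tangent_correlation(50.0, 100.0, 0.02, 20.0, 20, 20000, rng)
prediction = renormalized_prediction(50.0, 100.0, 0.02, 20.0)
```

In this toy setting the estimate should match the homogeneous model with $\beta = ak/(a+k)$ and differ visibly from the same formula evaluated with the bare $a$.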

The next question is whether it would be possible to replace the
nonvanishing $\omega_{i0}$ in the model by $\omega_{i0} = 0$ by further
renormalizing $\beta_i$. The answer to this question depends on the
situation. For the end-to-end distance of a very long filament free of
external force, the answer is yes [11, 12, ?]. But the convergence to
that limit is slow, so the above replacement is poor for two-dimensional
filaments of moderate length (from a few $l_p$ to about 20 $l_p$) [24].
When relating applied force and extension, such a replacement is also
only reasonable at low force and large $L$ [?].



IV. THE EFFECTS OF THE 
SEQUENCE-DEPENDENT PERSISTENCE 
LENGTHS 

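Now we consider the effects of the $a_i(s)$ alone but keep $\zeta_i$
well-defined. In this case, we assume that the distribution of the $a_i$
is half-Gaussian since $a_i < 0$ is meaningless:

$$ P(\{a_i(s)\}) = \exp\left\{-\frac{1}{2}\int_0^L\sum_{i=1}^{3} b_i\,[a_i(s)-\bar{a}_i]^2\,ds\right\}, \qquad a_i > 0. \tag{9} $$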



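It is difficult to do an average over $a_i$ if $\bar{a}_i$ is small.
Therefore, we assume that $\bar{a}_i$ is far from zero, which is
reasonable for semiflexible biopolymers such as dsDNA, so that the
restriction $a_i > 0$ can be dropped and approximately we have

$$ Z_a \equiv \int\mathcal{D}[a_i]\,P(\{a_i(s)\}) \approx \int\mathcal{D}[a_i]\exp\left\{-\frac{1}{2}\int_0^L\sum_{i=1}^{3} b_i\,[a_i(s)-\bar{a}_i]^2\,ds\right\}, \tag{10} $$

and so $Z_a$ is dependent on $b_i$ only. In this case,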



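$$ B_a = \frac{1}{Z_a}\int\mathcal{D}[a_i]\,P(\{a_i(s)\})\,\frac{1}{Z_\omega}\int\mathcal{D}[\omega_i]\,B[\{\omega_i(s)\}]\,e^{-\mathcal{E}} = \frac{1}{Z_a}\int\mathcal{D}[\omega_i]\,B[\{\omega_i(s)\}]\,C[\{\omega_i(s)\}], \tag{11} $$

where

$$ C[\{\omega_i(s)\}] = \int\mathcal{D}[a_i]\,\frac{1}{Z_\omega}\,P(\{a_i(s)\})\,e^{-\mathcal{E}}, \tag{12} $$

and $Z_\omega = \int\mathcal{D}[\omega_i]\,e^{-\mathcal{E}}$ is
dependent on $a_i(s)$. Applying standard path integral methods leads to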



$$ Z_\omega = \prod_{i=1}^{3}\prod_{j=1}^{N}\left(\frac{2\pi}{\epsilon\,a_{ij}}\right)^{1/2}, \tag{13} $$

where $\epsilon = L/N$, and $a_{ij} = a_i[(j-1)\epsilon]$ is the
discretized $a_i(s)$. The form of $Z_\omega$ makes it impossible to find
a closed form for $C[\{\omega_i(s)\}]$. However, if the distribution of
$a_i$ is narrow, which should be the case when the molecules forming the
different segments are similar, as in dsDNA, we can replace the $a_i$ in
Eq. (13) by $\bar{a}_i$, so $Z_\omega$ can be taken out of the integrand
in $C$ [see Eq. (12)] and written as

$$ Z_\omega \approx \int\mathcal{D}[\omega_i]\,e^{-\mathcal{E}_1}, \tag{14} $$

where

$$ \mathcal{E}_1 = \frac{1}{2}\sum_{i=1}^{3}\int_0^L\bar{a}_i\,(\omega_i-\zeta_i)^2\,ds. \tag{15} $$
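
To get a feeling for how good the replacement of $a_{ij}$ by $\bar{a}_i$ in $Z_\omega$ is, one can compare the logarithms of the two discretized products. The sketch below assumes the per-site Gaussian normalization $(2\pi/\epsilon a_{ij})^{1/2}$ and uses a hypothetical chain whose stiffness alternates by plus or minus 10% around its mean; both function names are illustrative.

```python
import math

def log_z_omega(a, eps):
    """log of prod_i prod_j (2*pi / (eps * a_ij))**0.5 for discretized a_ij."""
    return sum(0.5 * math.log(2.0 * math.pi / (eps * a_ij))
               for row in a for a_ij in row)

def log_z_omega_mean(a, eps):
    """Same quantity with every a_ij replaced by the row mean (narrow-distribution approximation)."""
    total = 0.0
    for row in a:
        mean = sum(row) / len(row)
        total += 0.5 * len(row) * math.log(2.0 * math.pi / (eps * mean))
    return total

# Hypothetical example: 100 sites per component, stiffness alternating 55 / 45
row = [55.0, 45.0] * 50
a = [list(row) for _ in range(3)]
eps = 0.1
difference = log_z_omega(a, eps) - log_z_omega_mean(a, eps)
```

For this example the difference per site is of second order in the relative spread of $a_{ij}$, but it accumulates with the number of sites, which is consistent with the requirement that the distribution be narrow.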



As a consequence,

$$ C[\{\omega_i(s)\}] \approx \frac{1}{Z_\omega}\int\mathcal{D}[a_i]\,P(\{a_i(s)\})\,e^{-\mathcal{E}}. \tag{16} $$



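Now using the identity

$$ b(a-\bar{a})^2 + a(\omega-\zeta)^2 = b\left[a-\bar{a}+\frac{(\omega-\zeta)^2}{2b}\right]^2 + \bar{a}(\omega-\zeta)^2 - \frac{(\omega-\zeta)^4}{4b}, \tag{17} $$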



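we obtain

$$ B_a \approx \frac{1}{Z_\omega}\int\mathcal{D}[\omega_i]\,B[\{\omega_i(s)\}]\,e^{-\mathcal{E}_2}, \tag{18} $$

where

$$ Z_\omega \approx \int\mathcal{D}[\omega_i]\,e^{-\mathcal{E}_1}, \tag{19} $$

$$ \mathcal{E}_2 = \mathcal{E}_1 - \sum_{i=1}^{3}\int_0^L\frac{(\omega_i-\zeta_i)^4}{8\,b_i}\,ds. \tag{20} $$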
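
The algebra leading to the quartic term can be checked for a single grid site. In the sketch below, `x` stands for $(\omega-\zeta)^2$, the lattice spacing `eps` and the parameter values are assumptions, and the Gaussian average over $a$ is taken over all real values (the approximation in which the restriction $a > 0$ is dropped).

```python
import math

def identity_gap(a, abar, b, x):
    """Difference between the two sides of the completing-the-square identity,
    with x = (omega - zeta)**2."""
    left = b * (a - abar) ** 2 + a * x
    right = b * (a - abar + x / (2.0 * b)) ** 2 + abar * x - x * x / (4.0 * b)
    return left - right

def gaussian_average(abar, b, x, eps, n=4001, width=12.0):
    """Trapezoid estimate of <exp(-eps*a*x/2)> over the weight
    exp(-eps*b*(a - abar)**2 / 2), integrating over all real a."""
    sigma = 1.0 / math.sqrt(eps * b)
    lo = abar - x / (2.0 * b) - width * sigma   # covers the shifted peak
    hi = abar + width * sigma                   # covers the unshifted peak
    h = (hi - lo) / (n - 1)
    num = 0.0
    den = 0.0
    for m in range(n):
        a = lo + m * h
        w = 1.0 if 0 < m < n - 1 else 0.5
        p = math.exp(-0.5 * eps * b * (a - abar) ** 2)
        num += w * p * math.exp(-0.5 * eps * a * x)
        den += w * p
    return num / den

def closed_form(abar, b, x, eps):
    """exp(-eps*abar*x/2 + eps*x**2/(8*b)): mean stiffness plus quartic term."""
    return math.exp(-0.5 * eps * abar * x + eps * x * x / (8.0 * b))

numeric = gaussian_average(abar=50.0, b=0.5, x=0.04, eps=1.0)
analytic = closed_form(abar=50.0, b=0.5, x=0.04, eps=1.0)
```

The positive sign of the `x * x / (8 * b)` exponent is what makes the weight grow at large $|\omega-\zeta|$.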



Due to the existence of the term $(\omega_i-\zeta_i)^4$ in Eq. (20),
$B_a$ is divergent if there is no constraint on $\omega_i$. However,
biopolymers cannot have infinite $\omega_i$, so there is a cutoff for
$\omega_i$. This cutoff should be large enough so that for the
$(\omega_i-\zeta_i)^2$ term we can remove the constraint on $\omega_i$.
Moreover, it was reported that for a dsDNA chain with 64 trinucleotides,
$\langle[l_p(s)-\bar{l}_p]^2\rangle \approx 0.13\,\bar{l}_p^{\,2}$ [?].
This means that even for a short dsDNA chain, the distribution of $l_p$
is not very wide. It is therefore reasonable to expect that for a long
semiflexible biopolymer, the distribution of the $a_i$'s becomes very
sharp and the $b_i$'s are large. We can then expect that the
$(\omega_i-\zeta_i)^4$ term remains small and can be neglected up to the
cutoff of $\omega_i$. Consequently we have

$$ B_a \approx \frac{\int\mathcal{D}[\omega_i]\,B[\{\omega_i(s)\}]\,e^{-\mathcal{E}_1}}{\int\mathcal{D}[\omega_i]\,e^{-\mathcal{E}_1}}. \tag{21} $$



Eq. (21) means that we can replace $a_i(s)$ by $\bar{a}_i$. This
conclusion agrees with the conclusion for the special case $a_1 = a_2$,
$a_3 = 0$ and $\omega_{i0} = 0$ [19]. However, for a short biopolymer,
the contribution from $(\omega_i-\zeta_i)^4$ cannot be ignored, and the
results tend to be divergent, making the averages poorly defined
functions of $L$, as was reported for the special case [19].

From the above derivations, we see that it is not a simple task to study
the combined effects of the sequence dependence of $a_i$ and $\zeta_i$,
because of the term $(\omega_i-\zeta_i)^4$ and the fact that $\beta_i$
is not a linear function of $a_i$. However, when Eq. (21) is valid,
Eqs. (7)-(8) can be recovered with the replacement of $a_i$ by
$\bar{a}_i$.



V. CONCLUSION AND DISCUSSION 

In summary, we present a rigorous and general proof that for a
biopolymer without correlation or with SRC in the spontaneous curvatures
and torsions $\zeta_i$, the effects of sequence disorder in $\zeta_i$
can be incorporated into a model with well-defined mean $\zeta_i$ (i.e.,
$\omega_{i0}$) as well as renormalized persistence lengths, no matter
how long the biopolymer and how large the external force may be.
Moreover, if the biopolymer is sufficiently long and has a large enough
mean persistence length, the sequence-dependent persistence length
$a_i(s)$ can be replaced by its mean $\bar{a}_i$. Since "semiflexible"
in general means that the biopolymer has a sufficiently large
$\bar{a}_i$, the above conclusions can be safely applied to long
semiflexible biopolymers such as dsDNA. However, for a short biopolymer
or for a biopolymer with small $\bar{a}_i$, the effects of inhomogeneity
in $a_i(s)$ become very complex and tend to make physical observables
very sensitive to the details of $a_i(s)$. Our derivations are quite
general, so the conclusions can be applied to various conformational and
mechanical properties.

We should also note that our proof works only in the constant extension
ensemble. It is nevertheless reasonable to expect that these conclusions
also apply to sufficiently long biopolymers in the constant force
ensemble, since in this case the structural details must be immaterial.
It is known that the constant extension ensemble and the constant force
ensemble may be inequivalent at finite $L$. For the constant force
ensemble, the same conclusion has been reached in a special case with
$a_1 = a_2$, $\omega_{10} = \omega_{20} = 0$ and under weak applied
force [17], but a proof for the general case is not yet available. In
the constant force ensemble, we need to add a term, which is the
contribution of the external force, to the energy, and the energy
becomes [?, 18]

$$ \mathcal{E} = \int_0^L\left[\frac{1}{2}\sum_{i=1}^{3} a_i\,(\omega_i-\zeta_i)^2 - F\cos\theta\right]ds, \tag{22} $$

where $F = f/(k_B T)$ and $f$ is the applied force; $\theta$ is the
angle between the force and the tangent of the center line of the
filament and is a very complicated function of $\omega_i$. This force
term makes $Z_\omega$ dependent on $\zeta_i$ and so invalidates the
exchange of the order of integration used in deriving Eq. (7).
Therefore, whether the same conclusion is valid in the constant force
ensemble for a short biopolymer is still an open question.

Moreover, we should recall that SRC in this work means that, with a
proper length scale, the distribution function is Gaussian. What the
proper length scale is remains unclear. It has been reported that the
most bendable DNA sequences are those that wrap around nucleosomes, and
there exists a correlation in the way they are arranged. Along the DNA
contour, AA/TT/TA dinucleotides have a periodicity of about 10
basepairs, and this is the signature of regions with high affinity for
nucleosomes [25]. Therefore, a reasonable estimate of the proper length
scale for dsDNA is about 10 basepairs. We do not consider systems with
LRC in $\zeta_i$ and/or $a_i$ in this work, so this case deserves
further investigation. But we should point out that LRC in sequences is
not the same as LRC in $\zeta_i$. For instance, for a homopolymer, the
correlation in sequences is 100%, but it can be described by constant
(or vanishing) $\zeta_i$ and so can be regarded as having no
correlations in $\zeta_i$, since it corresponds to the limiting case of
a Gaussian distribution with vanishing variances. In the more general
case, LRC in sequences tends to make neighboring sequences have similar
bending and so to develop a macroscopic intrinsic curvature, and the
local intrinsic curvatures may have only a small random deviation from
their mean; this in turn leads to SRC in $\zeta_i$, at least in the
first approximation. As a consequence, many properties of such a
biopolymer, such as the behavior of the end-to-end distance [24], can be
well accounted for by a model with constant (or well-defined)
spontaneous curvature. Finally, the method used in this work may be
applied to some other similar systems, such as Hookean springs with
random natural lengths, a quantum harmonic oscillator with randomly
moving centers, or a quantum planar rotor in a randomly rotating
coordinate system.



Acknowledgments 

This work has been supported by the National Science 
Council of the Republic of China under grant no. NSC 
96-2112-M-032-002, the Physics Division, National Cen- 
ter for Theoretical Sciences at Taipei, National Taiwan 
University, Taiwan, ROC, and the Natural Sciences and 
Engineering Research Council of Canada. 



[1] S. B. Smith, L. Finzi, and C. Bustamante, Science 258, 1122 (1992).
[2] T. R. Strick, J. F. Allemand, D. Bensimon, A. Bensimon, and V. Croquette, Science 271, 1835 (1996).
[3] P. Cluzel, A. Lebrun, C. Heller, R. Lavery, J. L. Viovy, D. Chatenay, and F. Caron, Science 271, 792 (1996).
[4] S. B. Smith, Y. Cui, and C. Bustamante, Science 271, 795 (1996).
[5] O. Kratky and G. Porod, Recl. Trav. Chim. Pays-Bas 68, 1106 (1949).
[6] J. F. Marko and E. D. Siggia, Science 265, 506 (1994).
[7] C. Bustamante, J. F. Marko, E. D. Siggia, and S. Smith, Science 265, 1599 (1994).
[8] J. F. Marko and E. D. Siggia, Macromolecules 28, 8759 (1995).
[9] B. Fain, J. Rudnick, and S. Ostlund, Phys. Rev. E 55, 7364 (1997).
[10] A. Goriely and M. Tabor, Proc. R. Soc. London, A 453, 2583 (1997).
[11] S. V. Panyukov and Y. Rabin, Phys. Rev. E 62, 7135 (2000).
[12] S. V. Panyukov and Y. Rabin, Phys. Rev. E 64, 011909 (2001).
[13] D. A. Kessler and Y. Rabin, Phys. Rev. Lett. 90, 024301 (2003).
[14] Z. Zhou, P.-Y. Lai, and B. Joos, Phys. Rev. E 71, 052801 (2005).
[15] Z. Zhou, B. Joos, P.-Y. Lai, Y.-S. Young, and J.-H. Jan, Mod. Phys. Lett. B 21, 1895-1913 (2007).
[16] H. Wada and R. R. Netz, Europhys. Lett. 77, 68001 (2007).
[17] P. C. Nelson, Phys. Rev. Lett. 80, 5810 (1998).
[18] D. Bensimon, D. Dohmi, and M. Mezard, Europhys. Lett. 42, 97 (1998).
[19] Y. O. Popov and A. V. Tkachenko, Phys. Rev. E 76, 021901 (2007).
[20] J. Moukhtar, E. Fontaine, C. Faivre-Moskalenko, and A. Arneodo, Phys. Rev. Lett. 98, 178101 (2007).
[21] S. Rappaport and Y. Rabin, Macromolecules 37, 7847 (2004).
[22] Giampaolo Zuccheri, Anita Scipioni, Valeria Cavaliere, Giuseppe Gargiulo, Pasquale De Santis, and Bruno Samorì, Proc. Natl. Acad. Sci. U.S.A. 98, 3074 (2001).
[23] John van Noort, Thijn van der Heijden, Martijn de Jager, Claire Wyman, Roland Kanaar, and Cees Dekker, Proc. Natl. Acad. Sci. U.S.A. 100, 7851 (2003).
[24] Z. Zhou, Phys. Rev. E 76, 061913 (2007).
[25] Eran Segal, Yvonne Fondufe-Mittendorf, Lingyi Chen, AnnChristine Thåström, Yair Field, Irene K. Moore, Ji-Ping Z. Wang, and Jonathan Widom, Nature 442, 772 (2006).
[26] M. Dlakic, K. Park, J. D. Griffith, S. G. Harvey, and R. E. Harrington, J. Biol. Chem. 271, 17911 (1996).
[27] W. Han, S. M. Lindsay, M. Dlakic, and R. E. Harrington, Nature (London) 386, 563 (1997).
[28] D. C. Khandekar, S. V. Lawande, and K. V. Bhagwat, Path Integral Methods and Their Applications (World Scientific, Singapore, 1993).



